Validation Test Report for a Genetic Algorithm in the Glider Observation STrategies
(GOST 1.0) Project: Sensitivity Studies.

1.  Introduction 

The  Environmental  Measurements  Path  Planner  (EMPath)  is  a  Genetic  Algorithm  (GA)  software 
that  has  been  developed  for  directing  sampling  platforms  (such  as  autonomous  ocean  gliders)  on 
preferential  paths  to  achieve  more  effective  coverage  or  transits  in  an  area  of  interest  (Heaney  et 
al.,  2007).  Glider  Observation  Sampling  Strategies  (GOST)  translates  a  glider  sampling  strategy 
into criteria for evaluating alternative glider paths through EMPath. In GOST 1.0, optimal paths
are  designed  to  target  areas  of  large  model  forecast  uncertainty;  GOST  2.0  will  expand  mission 
criteria  to  include  area  coverage  and  searches  to  define  relevant  ocean  features.  By  using 
environmental  model  data,  EMPath  evaluates  alternative  sets  of  glider  instructions  by 
determining  the  resulting  glider  motion  subject  to  available  descriptions  of  currents  and  other 
variables  that  will  impact  the  glider’s  mobility.  EMPath  evaluates  the  resulting  sets  of  glider 
trajectories  relative  to  a  cost  function  that  quantifies  the  relative  benefit  expected  from  different 
sets  of  observations  and  identifies  the  most  effective  set.  Observations  from  gliders  directed 
according  to  the  EMPath  guidance  will  be  more  relevant  for  assimilation  into  the  real-time 
models  addressing  the  GOST-defined  mission  for  the  target  area.  Utilization  of  these  tools  for 
glider  placement  under  GOST  assists  the  Navy  in  optimizing  the  value  of  glider  observations 
while reducing manpower requirements (Memorandum 3100; Memorandum 3140). The future goal
of increasing the number of gliders in Navy observation networks will only be manageable
with such automation. Using the GA to do the background work of optimizing glider paths,
particularly with the longer mission time frames encompassed by GOST 2.0, allows the
operational center to adopt a proactive approach that maneuvers assets through changing ocean
currents. This will ensure that they are more effective in sustaining extended support for mission
objectives.

EMPath interfaces with the Relocatable Circulation Prediction System (RELO), which has been
operational since August 2008 (Rowley, 2010). RELO has two major components: 1) the Navy
Coupled Ocean Data Assimilation (NCODA) (Cummings, 2005) for data analysis and model
initialization, and 2) the Navy Coastal Ocean Model (NCOM) (Barron et al., 2006; Martin et al.,
2009) for the ocean dynamics prediction. The system also has the capability of performing
ensemble runs initialized by the Ensemble Transform and forced by atmospheric fields perturbed
by the space-time deformations method (Hong and Bishop, 2007; Coelho et al., 2010a).

This report discusses two basic sets of experiments that were conducted to evaluate the
effect of EMPath in guiding gliders to the best locations to provide feedback to an ocean model.
The first set is an Observation System Simulation Experiment (OSSE) (Masutani et al., 2010)
in the Okinawa Trough. In this approach one model simulation is designated to be the true
ocean, often called the nature run, from which data can be extracted and assimilated in the other
simulations. Another run is identified as the control, a run which employs the present standard
observing systems and assimilation capabilities. For these experiments, the inputs for the GA are
derived from several criteria to test the impact of providing glider guidance from station and
trajectory variability versus the forecast uncertainty as derived from an ocean model ensemble.
The goal is to evaluate the skill and limits of each approach. The second set of experiments, the
Maritime Rapid Environmental Assessment of 2010 (MREA10), is a true real-time exercise with
a full feedback cycle, using model prediction to guide the glider and assimilating the collected
data back into the model. Although not a Naval exercise, MREA10 was used to exemplify
the data flow challenges of a real-time Naval exercise, in a manner similar to executing such a
system on Naval Oceanographic Office (NAVO) computational platforms.

2.  Genetic  Algorithm  Background 

The Environmental Measurements Path Planner, EMPath (Heaney et al., 2007), determines an
optimal search plan for a network of inhomogeneous sampling platforms. The multi-objective
cost function (CF) to be minimized is a linear combination of individual constituent cost
functions (CCF). These CCF contain the oceanography, physics and Navy mission related
information that the expert user determines should drive the update criterion. Current CCF are
based upon model forecast uncertainty, ocean temporal-spatial variability and, if user interest
merits, ocean acoustic sensitivity for ASW applications. The CF for a specific asset laydown
strategy, E(r), is the normalized, weighted sum of the user-defined constituent cost functions.

Constraints include boundary constraints, $C_b$, which cover bathymetry, operational
area definition, and water space management. For multiple vehicle optimization, the
distance-potential constraints, $C_{dp}$, are used to keep multiple vehicles apart and are added
to the user-defined normalized sum. The CF is then expressed as:

$$E(r) = \sum_{i=1}^{N} \frac{W_i \, C_i(r)}{\sigma(C_i)} + w_b C_b + w_{dp} C_{dp}$$

where $W_i$ are the user-specified weighting functions, $N$ is the number of constituent cost
functions, and $\sigma(C_i)$ is the normalization term for each cost function. The normalization
term is the rms value of each sample cost function, so that the large differences in magnitude
among the multiple cost functions are accounted for, resulting in a non-dimensional CCF.
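
As a minimal sketch of the weighted, normalized sum described above, the following Python evaluates E(r) for a small set of candidate laydowns. The function names, the sample values, and the choice to take the rms over the supplied candidates are assumptions made for illustration; this is not EMPath's implementation.

```python
import math

def rms(values):
    """Root-mean-square of a sequence of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def total_cost(ccf_samples, weights, index, c_b=0.0, c_dp=0.0, w_b=1.0, w_dp=1.0):
    """Evaluate E(r) for the candidate at position `index`.

    ccf_samples[i][k] is constituent cost function i evaluated for candidate k.
    Each CCF is divided by its rms over all candidates (assumed normalization
    sample) so the terms are non-dimensional before weighting.
    """
    total = 0.0
    for w, samples in zip(weights, ccf_samples):
        sigma = rms(samples)
        if sigma > 0.0:
            total += w * samples[index] / sigma
    # Boundary and distance-potential constraint terms are added to the sum.
    return total + w_b * c_b + w_dp * c_dp

# Two hypothetical CCFs with very different magnitudes, three candidates each.
ccf_samples = [[3.0, 4.0, 0.0], [300.0, 400.0, 0.0]]
weights = [0.5, 0.5]
scores = [total_cost(ccf_samples, weights, k) for k in range(3)]
```

Because each CCF is divided by its own rms, the two hypothetical CCFs contribute equally to each score even though their raw magnitudes differ by a factor of 100.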

The genetic algorithm (GA) is a search technique for solving constrained, large-dimensional,
nonlinear optimization problems (Goldberg, 1989). The GA has been successfully applied
to many such problems, including, for example, geo-acoustic inversion in underwater acoustics
(Gerstoft, 1994; Gerstoft and Gingras, 1996). The algorithm is loosely based on the process of
natural selection in evolutionary biology. A gene is defined as a vector that uniquely determines
a parameter of the search space, such as sensor platform deployment coordinates. Following
the analogy of natural selection, a population is generated from a random sampling of a particular
gene pool, which spans the multi-dimensional search space. A population is a set of individuals,
each having a set of genes specifying a unique measurement approach, which we refer to as the
sensor laydown. For example, in a five-glider problem, each individual represents a different
time/space transect pattern for five gliders. Beginning with an initial random sampling scheme
(first generation), and iterating over generations, a gain over the cost function (or fitness) of each
individual is computed. Using this information, fit individuals are selected and mated and a new
generation of individuals is produced. Unfit individuals (those with poor fitness values) are not
reproduced. A random crossover of parent genes generates the genes of the children. To reduce
the probability of converging to a local cost-function maximum, a small fraction of random
mutations of individual genes is permitted in each generation. Reproduction and fitness
testing continue until an exit criterion is met. Example exit criteria are a minimum percentage
change in the fitness function, or a maximum number of generations.
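
The selection, crossover, mutation and exit steps described above can be sketched as follows. This is a generic GA written for illustration, not the EMPath code: the gene encoding (numbers in [0, 1]), the toy fitness function, the retention of the best individual (elitism), and all parameter values are assumptions.

```python
import random

def run_ga(fitness, n_genes, pop_size=20, max_generations=100,
           mutation_rate=0.05, min_change=1e-9, stall_limit=10, seed=0):
    """Maximize `fitness` over gene vectors whose components lie in [0, 1]."""
    rng = random.Random(seed)
    # First generation: a random sampling of the gene pool.
    population = [[rng.random() for _ in range(n_genes)] for _ in range(pop_size)]
    history = []
    for _ in range(max_generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        # Exit criterion: best fitness has stopped changing for stall_limit generations.
        if len(history) > stall_limit and history[-1] - history[-1 - stall_limit] <= min_change:
            population = ranked
            break
        parents = ranked[: pop_size // 2]      # the unfit half is not reproduced
        children = [ranked[0][:]]              # keep the best individual (assumed elitism)
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            # Random crossover of parent genes.
            child = [m if rng.random() < 0.5 else f for m, f in zip(mother, father)]
            # A small fraction of genes is randomly mutated.
            child = [rng.random() if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = children
    best = max(population, key=fitness)
    return best, history

def fitness(genes):
    """Toy fitness (hypothetical): highest when every gene equals 0.7."""
    return -sum((g - 0.7) ** 2 for g in genes)

best, history = run_ga(fitness, n_genes=4)
```

With elitism the best fitness never decreases from one generation to the next, which makes the stall-based exit criterion straightforward to apply.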

EMPath  includes  a  simple  kinematic  model  for  each  platform,  moving  within  a  forecast  ocean 
velocity  field.  The  velocity  of  the  platform  is  added  linearly  to  the  forecast  ocean  current  vector 
as  a  function  of  time.  The  inclusion  of  ocean  current  in  the  generation  of  the  path  sampling 
vector  constrains  the  solution  space  to  searches  that  are  achievable  -  to  within  the  accuracy  of 
the  ocean  forecast  velocity  field. 
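
A minimal sketch of such a kinematic model is given here, assuming a flat local plane in metres, a constant commanded heading, and forward Euler time stepping. The function `advance` and the `current_at` callback are hypothetical names, not part of EMPath.

```python
import math

def advance(position, heading_deg, speed, current_at, dt, n_steps):
    """Integrate a platform track with forward Euler steps.

    position    : (x, y) in metres on a local flat plane (assumption)
    heading_deg : commanded heading, degrees clockwise from north
    speed       : platform speed through the water, m/s
    current_at  : function (x, y, t) -> (u, v) forecast current, m/s
    """
    x, y = position
    track = [(x, y)]
    theta = math.radians(heading_deg)
    for step in range(n_steps):
        t = step * dt
        u, v = current_at(x, y, t)
        # Velocity over ground = platform velocity + forecast ocean current.
        x += (speed * math.sin(theta) + u) * dt
        y += (speed * math.cos(theta) + v) * dt
        track.append((x, y))
    return track

# Hypothetical case: uniform 0.1 m/s eastward current, glider heading due north at 0.3 m/s.
track = advance((0.0, 0.0), 0.0, 0.3, lambda x, y, t: (0.1, 0.0), dt=3600.0, n_steps=48)
```

In this example the glider is commanded due north, yet the track drifts east with the current, which is the effect that restricts the solution space to achievable paths.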

The primary purpose of the morphology figure is to give the user a level of confidence that
the GA has indeed guided the network to regions where the constituent cost functions are
large. The morphology computation is an estimate of the integration of the cost function
which the Genetic Algorithm is using to optimize sensor locations. To estimate the
shape of the multi-dimensional cost function (the morphology), a glider is positioned at each
lat/lon grid point in the NCOM forecast ocean field. The cost function is estimated for a glider
trajectory due north, due east, due south and due west for as many hours as specified by the user
(morph_hours). The scores for the 4 trajectories are averaged into the morphology estimate for
that location. A short computation time leads to higher resolution figures, with less horizontal
averaging, but underestimates the temporal dynamics of the 4D cost function. A longer
morphology computation time smooths the spatial scales, but includes the CCF at later times.
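
The four-direction averaging can be sketched as below. This simplified version ignores ocean currents and moves the glider a fixed number of degrees per hour; `cost_at` is a hypothetical stand-in for the 4D constituent cost function, and only the name morph_hours comes from the text.

```python
def morphology(cost_at, lats, lons, speed_deg_per_hr, morph_hours):
    """Average the cost accumulated along due N, E, S and W legs from each grid point.

    cost_at(lat, lon, hour) -> cost value (hypothetical interface).
    Returns a 2D list indexed as [lat index][lon index].
    """
    directions = [(1, 0), (0, 1), (-1, 0), (0, -1)]  # N, E, S, W as (dlat, dlon)
    result = []
    for lat in lats:
        row = []
        for lon in lons:
            scores = []
            for dlat, dlon in directions:
                score = 0.0
                for hour in range(1, morph_hours + 1):
                    step = speed_deg_per_hr * hour
                    score += cost_at(lat + dlat * step, lon + dlon * step, hour)
                scores.append(score)
            # The scores of the 4 trajectories are averaged for this location.
            row.append(sum(scores) / len(scores))
        result.append(row)
    return result
```

Increasing morph_hours lengthens each leg, so each grid value averages over a wider area and over later forecast times, which matches the trade-off described above.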


2.1  Targeting  Observations  Using  Ensembles 

The  problem  of  adapting  the  best  location  for  deploying  mobile  observation  platforms  in  a 
dynamic  environment  is  often  called  the  adaptive  sampling  or  targeting  observation  problem. 
The  importance  of  this  topic  has  been  heightened  in  oceanic  applications  by  the  advent  of 
Underwater  Automated  Vehicles  (UAVs).  Planning  the  missions  of  these  platforms  includes 
updating  reference  way-points  on  regular  schedules  such  that  one  must  solve  the  adaptive 
sampling  problem  before  some  critical  decision  time.  For  this  purpose,  the  Target  Observations 
Using  Forecast  Uncertainties  (TOFU)  (Coelho,  2010b)  system  uses  a  method  applied  by 
Majumdar  et  al.  (2002)  to  adaptive  sampling  in  atmospheric  modeling  applications.  This 
technique  uses  the  ensemble  forecast  (Bishop  et  al.,  2001)  and  rapid  low  rank  solutions  of  the 
Kalman filter equations to solve the targeting observation problem. The enabling technique,
the Ensemble Transform Kalman Filter (ETKF), allows for a mapping of the error covariance
through time and space. This is based on the assumption that the analysis error covariance at the
observation time can be estimated by evaluating the reduction for each feasible grid point of the
ensemble domain, taken as a single profile measurement, through a range of selected depths (for
the present example 0 to 1000 m to reproduce a glider profile observation).

The first step of this method is to identify the areas of interest inside the simulation domain,
hereafter referred to as the target box. A forecast time called the verification or target time is
the time at which the adaptive supplemental observations taken at an earlier observation time
will produce a maximum effect, defined by a fitness computed over the cost function. For this
TOFU version all these parameters are entered through a Graphical User Interface (GUI). The
cost function to be minimized is derived from the ensemble forecast variance of the temperature
and salinity and the parameters computed by the IAMPS system (Zingarelli and Fabre, 2009).
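
As an illustration of the raw ingredient of this cost function, the sketch below computes the pointwise variance across ensemble members. It shows only the ensemble spread; the ETKF-based estimate of variance reduction used by TOFU is not reproduced here, and the flat-list data layout is an assumption.

```python
def ensemble_variance(members):
    """Pointwise sample variance across ensemble members.

    members[m][k] is the value (e.g. temperature) of member m at grid point k.
    """
    n = len(members)
    n_points = len(members[0])
    means = [sum(m[k] for m in members) / n for k in range(n_points)]
    return [
        sum((m[k] - means[k]) ** 2 for m in members) / (n - 1)
        for k in range(n_points)
    ]

# Three hypothetical members on a two-point grid: spread at point 0, none at point 1.
variance = ensemble_variance([[10.0, 20.0], [12.0, 20.0], [14.0, 20.0]])
```

Grid points where the members disagree receive a large variance and would therefore be candidate locations for targeted observations.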


2.2  Ocean  Variability  Cost  Functions 

There are operational situations where, due to limited computational resources, an ensemble
forecast is not available. For these situations we estimate regions of model uncertainty from a
central forecast using a default program, datacx (Heaney et al., 2012). It is assumed that regions
of stronger dynamic oceanography will be correlated with regions of model uncertainty.
Certainly, one can expect the converse to be true: regions where there is little spatial or temporal
variability are regions where the model uncertainty is expected to be small (assuming initial
model bias has been removed at initialization). To this end, we define a temporal cost function
and its associated time mean as:


$$C_{temporal}(r) = \left\langle \left( T(x, y, z_{ref}, t) - \bar{T}(x, y, z_{ref}) \right)^2 \right\rangle^{1/2}$$

$$\bar{T}(x, y, z_{ref}) = \left\langle T(x, y, z_{ref}, t) \right\rangle$$

respectively, where the temporal rms and the averages are taken at each location in space (for a
specified $z_{ref}$). The fitness over these functions is computed by line integration over the
possible glider tracks (r).
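
The two steps, a temporal rms at each location followed by a line integral along a candidate track, can be sketched as follows. The trapezoidal rule, the planar coordinates and the dictionary lookup are assumptions chosen for brevity; datacx itself is not reproduced.

```python
import math

def temporal_rms(series):
    """RMS deviation of a time series from its own time mean."""
    mean = sum(series) / len(series)
    return math.sqrt(sum((v - mean) ** 2 for v in series) / len(series))

def track_fitness(cost_at, track):
    """Trapezoidal line integral of cost_at(x, y) along a polyline track."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        length = math.hypot(x1 - x0, y1 - y0)
        total += 0.5 * (cost_at(x0, y0) + cost_at(x1, y1)) * length
    return total

# Hypothetical temperature series at z_ref for two locations.
series = {(0.0, 0.0): [20.0, 22.0, 20.0, 22.0], (1.0, 0.0): [21.0, 21.0, 21.0, 21.0]}
cost = {point: temporal_rms(values) for point, values in series.items()}
fitness_value = track_fitness(lambda x, y: cost[(x, y)], [(0.0, 0.0), (1.0, 0.0)])
```

The location with an oscillating temperature receives a cost of 1.0 and the steady one a cost of 0.0, so a track passing through the variable location accumulates more fitness.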

Other sets of functions, aimed at the combined space-time variability, are designed directly
from the state variables (or other fields of interest). For these, the fitness is computed by
differentiation over the trajectories, such that those capturing fronts will show a larger skill. (The
EMPath User's Manual (Heaney et al., 2012) has a more complete explanation of appropriate
EMPath and datacx usage and parameters.)


3. Observation System Simulation Experiments (OSSEs) in the Okinawa Trough

The Okinawa Trough region has been used as a testbed for our theoretical setting (Fig. 1). From
August  to  November  2007  the  area  was  surveyed  extensively  and  used  to  populate  the  Naval 
Research  Laboratory  -  Stennis  Space  Center  (NRLSSC)  data  server  including  databases 
restricted  to  the  Department  of  Defense  (DoD)  obtained  by  NAVO,  as  well  as  public  data  sets. 
The RELO model domain configuration is similar to the one applied to assess acoustic
performance predictions (Rowley, 2010; Rowley et al., 2009; Coelho et al., 2010b). After an
initial spin up, the experiments were concentrated in the period Oct 15-Nov 1, 2007. The
operational area (where gliders are allowed to sample) spanned 121-127°E and 20-27°N,
and the target area (where forecast skill is expected to be improved) spanned 123-124°E
and 23-24°N. Six gliders were assumed over the operational area, sampling to a
maximum glider depth of 1000 m. Each EMPath cycle allowed for 500 generations, 100
individuals and 20 repetitions. From these executions EMPath provided a set of hourly
latitude and longitude positions for each glider over a 48-hour period.

Fig. 1 The experimental or operational area (20-27°N, 121-127°E) is shown in the outer box
and the target area (23-24°N, 123-124°E) in the inner box.


3.1  Explanation  of  Numerical  Experiments 

A RELO run assimilating all the available data over a 12 hr NCODA cycle is designated the
Nature Run (natrun) or truth (Fig. 2). Several variations of this RELO run were used in this
VTR. Unless otherwise specified, all the simulations are forced by the Coupled
Ocean/Atmosphere Mesoscale Prediction System (COAMPS_wpac 2) winds (Hodur, 1997) and
heat fluxes from the 0.5° Navy Operational Global Atmospheric Prediction System (NOGAPS)
(Hogan and Rosmond, 1991). Several criteria, simulating different possible applications in a
real-time scenario, were applied to the NCOM fields to provide CCF to the EMPath
algorithm (Fig. 3). When data assimilation is included, the NCOM runs provided a 48 hr
forecast. These CCF and the forecast ocean currents were used to provide hourly waypoints
for the glider trajectories, updated every 48 hours. Note that US Navy guidance is given with
waypoints separated by 12 hr, but for this experiment we assume the gliders follow the
recommended hourly trajectories. From these trajectories, simulated glider observations were
then produced by sampling the natrun at each glider's locations and times. Our goal is to verify
that the assimilation of the glider profiles will improve the model performance.

The comma-separated values file provided by EMPath (Heaney et al., 2012) for glider
guidance is used as the input for an NCOM post-processing routine that extracts simulated
profiles of temperature and salinity from the nature run's output. For temporal interpolation, the
simulated profiles are taken at each hour and at the following hour (one hour is the time interval
of the NCOM output files).


The  different  cases  are: 

• Control Run (contrun) is the benchmark case. It includes data assimilation of all
available data from the NCODA directories, but no profiles, whether from gliders or from
NCODA.

• Case A, which will be referred to as ok_free, is a free run (i.e., no data assimilation). The
model is forced by atmospheric fields and restarted from the previous day's simulation.

The previous two cases are variations in model spinup and initial conditions. The following
cases refer to different manipulations of the model outputs to make cost functions.

• Case B is the run with 32 ensemble members. From the ensemble mean and variability,
TOFU created the NetCDF file that serves as input to EMPath. The TOFU GUI creates
a summary map from the RELO NCOM acoustic and tactical ensemble members, which is
referred to as an ETATM file and which also includes a set of 22 CCF in temperature (T),
Sonic Layer Depth (SLD), and below layer gradient (BLG). The summary maps identify
the relative impact of each grid point, if sampled independently, in reducing the forecast
error over the target area.

• Case C uses a cost function based on the error between the RELO forecast and the
natrun. An absolute temperature difference file is generated between the natrun and the
NCOM forecast over the 48 hours of fields:

$$\mathrm{Error} = \left| T_{natrun}(x, y, z, t) - T_{NCOM\ forecast}(x, y, z, t) \right|$$

These differences are normalized over the forecast period and fed into a datacx
preconditioner (Heaney et al., 2012). The NCODA processing in the simulations runs daily
using these absolute temperature differences instead of the usual error between analysis
and forecast (Rowley, 2010). This represents a perfect cost function which, though not
possible operationally, could be extrapolated to areas where there are known simulation
errors.

• Case D uses the RELO forecast temperature and salinity variability to compute the cost
function. This is the backup that allows operational areas to receive information on ocean
dynamics when ensembles are not run. In some areas of interest, executing ensembles is
not always feasible due to computer resource limitations. The datacx preconditioner
reads the files over the forecast period and normalizes the variability to
calculate a CCF. The NCODA is based on the error between analysis and forecast fields.

•  Case  E  is  a  lawnmower  case  which  would  closely  resemble  an  array  deployment  of 
gliders.  Gliders  are  initialized  to  a  starting  position  and  given  a  bearing  to  go  from  west 
to  east  and  back.  EMPath  was  only  used  as  a  kinematic  solver  to  confirm  the  feasibility 
of  straight  line  glider  paths  in  the  presence  of  the  forecast  ocean  model. 


For each Case, we have conducted parallel simulations as summarized in Table 1:

• Case B-E: henceforth referred to as the standard set. The models are initialized by
the contrun and include assimilation of all data. Gliders are directed by EMPath. We
also have a twin experiment, CaseBr, to verify the impact of different correlation length
scales on the data assimilation and model performance, as discussed in a later
section.

• Case Bf-Ef: models are initialized by the free run and, during the two-week comparison
period, include assimilation of all data. The purpose of using free-running initial conditions
was to compare the addition of these capabilities to a simpler simulation. The simulated
profiles from the natrun over the glider paths were not recalculated. The gliders' travel
paths are from the standard cases (i.e., EMPath was not re-executed). These simulations
are henceforth referred to as ICfree.

• Case Bp-Ep: models are initialized by the contrun and assimilate only the profiles (no
surface data were assimilated) with the goal of isolating the gliders' impact. Even though
other profile data are present, the glider data will still provide the bulk of the assimilation
depending on the configuration of gliders. As in the previous alteration, EMPath was not
rerun. These simulations are henceforth referred to as profDA.

Table 1: Glider treatment vs. pre-processing

                                                          Initial Condition/Data Assimilation
Glider Treatment                                          IC contrun,          IC free run,         IC contrun, glider profiles
                                                          all available data   all available data   only/no surface data
--------------------------------------------------------  -------------------  -------------------  ---------------------------
Ensemble spread                                           Case B               Case Bf              Case Bp
Known Forecast Error                                      Case C               Case Cf              Case Cp
NCOM Forecast                                             Case D               Case Df              Case Dp
Lawnmower                                                 Case E               Case Ef              Case Ep
No assimilation                                           ok_free
No gliders/all other available surface data               contrun
No gliders/all other available surface and profile data   natrun


Horizontally, each glider profile is bilinearly interpolated to the nearest point to map the data
onto the model grid. The vertical interpolation uses a Piecewise Cubic Hermite Interpolating
Polynomial (Fritsch et al., 1980; Kahaner et al., 1988) that preserves the profile shape, retains
the monotonicity and matches the maximum/minimum values at the points of the original field.
The latter properties may have some effect on our evaluation, as discussed in a following
section. A subsequent script shapes the interpolated profiles into a descending-ascending
triangular path at four-minute intervals. This is accomplished with a linear vertical interpolation
between each hourly waypoint and the waypoint for the following hour to simulate the glider
movement. The whole procedure therefore requires a great amount of interpolation, which may
compromise our evaluation.
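
The descending-ascending sampling between two hourly waypoints can be sketched as follows. One full dive cycle per hourly leg, a surface start, and the function name `yo_yo_samples` are assumptions; the actual script may shape the path differently.

```python
def yo_yo_samples(wp_start, wp_end, max_depth=1000.0, step_min=4, leg_min=60):
    """Return (minute, lat, lon, depth) samples between two hourly waypoints.

    wp_start, wp_end : (lat, lon) waypoints one hour apart
    The horizontal position is interpolated linearly in time, and the depth
    follows a triangle: descending in the first half, ascending in the second.
    """
    samples = []
    for minute in range(0, leg_min, step_min):
        frac = minute / leg_min
        lat = wp_start[0] + frac * (wp_end[0] - wp_start[0])
        lon = wp_start[1] + frac * (wp_end[1] - wp_start[1])
        depth = max_depth * (1.0 - abs(2.0 * frac - 1.0))
        samples.append((minute, lat, lon, depth))
    return samples

# Hypothetical pair of hourly waypoints inside the target area.
samples = yo_yo_samples((23.0, 123.0), (23.1, 123.2))
```

With four-minute spacing each hourly leg yields 15 sample points, and each of them requires a further horizontal and vertical interpolation of the model fields.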

[Figure 2 panel: Sea Surface Temperature (°C), RELO NCOM, 11-08-2007 01Z, 0000 m; longitude axis 120°E to 132°E]

Fig. 2 A snapshot of the nature run or "truth"

[Figure 3 panel: VTR Case Studies in Okinawa Trough]

Fig. 3 The case studies in the Okinawa Trough area

Figure 4 is an example of different EMPath best waypoint solutions to illustrate a typical glider
guidance delivery. The glider paths are placed over the morphology. All cases are best solutions
of the highest level of confidence of large CF for 2 November 2007 and are plotted on different
scales. This illustrates the different emphasis of the various approaches over the operational
area.

The high impact area is indicated by red in the morphology. For Case B, TOFU provides
a clear area of high morphology. The Case B gliders can travel outside of the target area, but
will remain in the operational area. Figure 4b illustrates Case C glider behavior. Case C, while
having more information available than is realistic, looks for known maxima in the operational
area. The gliders' movement, however, is severely limited, as seen by the almost nonexistent
tracks. Case D's color scale (Figure 4c) illustrates that it does not determine obvious maxima
over the operational area, but it does produce a robust coverage. Case E, as shown in Figure 4d,
allows the gliders to attempt ideal lawnmower paths back and forth over the operational area.
The gliders show a slow drift but try to follow a man-in-the-loop determined path.


[Figure 4 panels (a)-(d): glider tracks over the morphology on maps of the Okinawa Trough region; color scale from 0 to 1]

Fig. 4 Examples of glider paths from Case B (a), Case C (b), Case D (c) and Case E (d) for 2 November 2007.

3.2 Results


Our goal was to evaluate how the assimilation of the simulated glider profiles affects the
prediction of the acoustic properties. Therefore, our analysis focused on the representation
of the SLD, a key variable determining the skill in trapping sound waves in
the upper ocean (Helber et al., 2010). Fig. 5 illustrates a day, 1 November 2007, on which the
natrun has a preponderance of low SLD and the free-running case has mostly higher SLD,
highlighting an extreme disagreement needing correction. The obvious red bias in the
free-running case (b), as opposed to the more blue areas in the nature run (a), illustrates areas
where glider improvements might be easily visible. This particular date was chosen for
comparison because the error was obvious to the naked eye.

The control run case and standard glider cases all show improvement over the free-running case
(Fig. 5c-5g). The plots of the free-running initial conditions show the glider impacts more clearly
(Fig. 5h-5k). Case Ef appears to have the most overall impact due to coverage. With
glider profiles only (Fig. 5l-5o), the data assimilation of the surface fields does not occur.
This makes it possible to see the gliders slicing into the red areas with the correct "blue"
(lower SLD) data. Realistically, there would usually be other NCODA data available. The
glider-only experiments were an additional check to observe the glider data impact.

These plots were created to initially assess the impact of any data over a free-running case. All
of the assimilated cases make some improvement over the free-running case.

[Figure 5 panels: SLD maps for each case at 2007110100; surviving panel titles: CaseBf, CaseBp, CaseCf, CaseCp]

Fig. 5 Sonic Layer Depth plots for 1 November 2007

Fig. 6 illustrates the mean and RMS error of SLD between the different cases and the natrun (true
ocean) as a function of time for the standard cases. The mean and RMS
were calculated over the target area. The results did not provide the expected improvement, and
no case emerged as the clear 'best' of the approaches. We have therefore questioned the
OSSE configuration. From our analysis, two major limitations have emerged: 1) the true and
simulated oceans were too similar for an effective application of NCODA, and 2) extensive
interpolation was applied in extracting the glider profiles.
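
The target-area statistics used in this comparison can be sketched as follows. An unweighted average over grid points is assumed, and the field values in the example are hypothetical; whether the original analysis applied area weighting is not stated.

```python
import math

def target_area_errors(case_field, truth_field, lats, lons, lat_range, lon_range):
    """Mean and RMS difference (case minus truth) over grid points in the target box.

    case_field[i][j] and truth_field[i][j] hold e.g. SLD at (lats[i], lons[j]).
    """
    diffs = [
        case_field[i][j] - truth_field[i][j]
        for i, lat in enumerate(lats)
        for j, lon in enumerate(lons)
        if lat_range[0] <= lat <= lat_range[1] and lon_range[0] <= lon <= lon_range[1]
    ]
    mean = sum(diffs) / len(diffs)
    rms = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return mean, rms

# Hypothetical SLD fields (m); the row at 25°N lies outside the 23-24°N target box.
lats, lons = [23.0, 24.0, 25.0], [123.0, 124.0]
case = [[13.0, 7.0], [13.0, 7.0], [99.0, 99.0]]
truth = [[10.0, 10.0], [10.0, 10.0], [10.0, 10.0]]
mean_err, rms_err = target_area_errors(case, truth, lats, lons, (23.0, 24.0), (123.0, 124.0))
```

In this example the mean error is zero while the RMS error is 3 m, which shows why both statistics are reported: offsetting biases can hide in the mean.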

One of the major parameters in NCODA is the specification of the correlation length scale (i.e.,
the radius of influence for the assimilated profile). In our cases, since there were no substantial
differences between the true and assimilated oceans, the correlation scale deteriorated (rather
than improved) the solution in the proximity of the profiles. This is most likely due to
interpolation and introduction in various forms to NCODA. To verify the validity of this
assumption, Case B was re-run with a smaller (and unrealistic) correlation length scale
(henceforth referred to as CaseBr). As Fig. 6 indicates, the error is noticeably reduced. The
horizontal and vertical interpolation applied to extract the simulated glider profiles also has a
major impact. As previously discussed, the vertical interpolation preserves the maximum and
minimum values at the original depths. Therefore, the thermocline depth, already poorly
represented by the coarse NetCDF z-levels (i.e., the standard Generalized Digital Environmental
Model (GDEM) 72 levels), was conserved in the interpolated profiles, and small variations in
the vertical field may have led to large differences in the SLD computations.

Fig. 6 The mean (a) and RMS (b) error over the target area as a function of time for the standard
cases.


Fig. 7 presents the SLD error time series for the ICfree cases. Relative to the control run, the
glider cases make a positive impact from the beginning of the simulation period. As with the
standard cases, no one case appears to be clearly best, though Case B does well overall. From
these calculations, Case B and Case D would do as well as the man-in-the-loop lawnmower type
case.
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Fig.  7  The  mean  (a)  and  RMS  (b)  error  over  the  target  area  as  function  of  time  for  free 
running  initial  condition  (ICfree)  cases. 


Fig. 8 shows the ProfDA SLD error comparisons. The control run, as expected, has the lowest
errors, resulting from the additional data. Case Cp clearly has the highest error. This may be
the result of restrictions on glider movement to high error areas, as was seen in Fig. 4b. Case
Bp has a very low error overall, which shows the success of the adaptive sampling as it focuses
on the target area and the gliders cooperate.

Fig. 8 The mean (a) and RMS (b) error over the target area as a function of time for the glider
profile data assimilation (ProfDA) cases.

We have also investigated the effect of the procedures on the detection skill and prediction of
surface ducting by trapping sound at frequencies of 600 Hz and higher. An analysis was done
using Matlab routines created at NRLSSC by Robert Helber to calculate the cutoff frequency
(COF) (Helber, 2010). Assigning a reference frequency of 600 Hz, comparisons were made
between the COF of the nature run and the COF of each case. If the nature run's COF was
greater than 600 Hz and the case study's was less, the case study had a false positive,
indicated by red. A false negative (yellow) occurred when the nature run predicted trapping and
the case study did not. A true positive, indicated by green, occurs when both the nature run and
the case study have a COF below 600 Hz and therefore predict trapping; conversely, a true
negative (white) occurs when both the case study and the nature run predict no trapping
(Table 2). Fig. 9 depicts an example of a stop-light plot over the target area.


Table 2 COF color scheme

                 Model < 600 Hz          Model > 600 Hz
True < 600 Hz    Green (true positive)   Yellow (false negative)
True > 600 Hz    Red (false positive)    Blue or white (true negative)

Fig. 9 An example of an SLD comparison between an experiment and the nature run,
where red is false positive, yellow is false negative, green is true positive, and white is
true negative.
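
The stop-light classification described above can be sketched in a few lines. This is only an illustration, not the NRLSSC Matlab routine: the function names are hypothetical, the COF fields are assumed to be flattened into plain sequences of values in Hz, and treating a COF exactly equal to 600 Hz as "no trapping" is an assumption.

```python
REF_HZ = 600.0  # reference frequency used in the text


def classify_point(cof_nature, cof_case, ref_hz=REF_HZ):
    """Return the Table 2 color for one grid point.

    A run "predicts trapping" when its cutoff frequency is below ref_hz.
    Assumption: a COF exactly equal to ref_hz counts as no trapping.
    """
    nature_traps = cof_nature < ref_hz
    case_traps = cof_case < ref_hz
    if nature_traps and case_traps:
        return "green"   # true positive
    if nature_traps and not case_traps:
        return "yellow"  # false negative
    if case_traps:
        return "red"     # false positive
    return "white"       # true negative


def count_colors(cof_nature_field, cof_case_field, ref_hz=REF_HZ):
    """Count the four categories over two equally sized COF fields."""
    counts = {"green": 0, "yellow": 0, "red": 0, "white": 0}
    for nature, case in zip(cof_nature_field, cof_case_field):
        counts[classify_point(nature, case, ref_hz)] += 1
    return counts
```

Counting the four categories per analysis time gives the kind of totals that are summarized in the histograms discussed next.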

Initially, the comparisons were made at analysis time 00. The investigation of behavior at 00 for
the OSSEs illustrated the effects of gliders in an area over time influencing model analyses over
many days. While the stoplight plots may present a spatial indication of false vs. true
predictions, they do not allow us to quantify the impact of the glider data. Counts were added to
the Matlab program and the results organized in histograms representing time 00Z (Fig. 10a-f).

Whereas red and yellow are incorrect (false) results, and blue and green are correct (true) (Table
2), the red and green are the most obvious in the histogram plots. As we step through the
two-week simulation period, it is obvious that the results are not consistent throughout. The free
running case (Fig. 10a) develops some predominantly green (true) results around October 20,
2007, and it continues to improve toward the end of the simulation, which is unexpected.

Fig. 10 A histogram to quantify the true vs. false variations along the nineteen days of the comparisons.


In  Fig.  10,  the  control  run  case  shows  some  obvious  blue  and  green  results  peeking  out  from 
behind  the  incorrect  red  bars  which  are  decreasing  from  their  values  in  the  free  run.  The  data 
assimilation  makes  a  noticeable  improvement  for  all  cases  on  the  first  day. 

Case C (Fig. 10d), where the actual errors are known, shows more improvement than
Case B at day one, as expected, due to the knowledge of the exact error. In Case D (Fig. 10e), the
results are still favorable. Toward the end of the two-week period, the true representation occurs
more often than the false one. Case E (man-in-the-loop lawnmower), while showing some
superiority towards the end of the 19 days, is not significantly better (Fig. 10f). This suggests
that gliders spaced at approximately equal distances with continuous coverage would
not be more beneficial than a genetic algorithm that zooms in on problem areas.

The other sets of simulations, ProfDA (Cases Bp-Ep) and ICfree (Cases Bf-Ef), have similar behavior.

To further quantify the COF true vs. false comparisons, the green and blue (true) values and the
yellow and red (false) values, respectively, were averaged over the 19-day simulation period. A
straight subtraction of true minus false gives an indication of correct minus incorrect values (the
optimal scenario would have zero incorrect values).


Fig. 11 illustrates the difference calculation of true and false assessments over the length of the
run. The y-axis is the number of (true - false) points. The highest values indicate the most
correct. A negative value indicates more incorrect than correct values. The x-axis is the day of
the model run from 15 October 2007 through 2 November 2007. The upward trend of the graph
demonstrates the improvement over the two-week simulation.

Fig. 11 A running calculation of the difference between true and false calculations of cutoff
frequency for the original OSSEs in the target area.

Fig. 12 A running calculation of the difference between true and false calculations of cutoff
frequency for the glider profile OSSEs in the target area.

Fig. 13 A running calculation of the difference between true and false calculations of cutoff
frequency for the glider profile OSSEs in the target area.




Table 4 presents the mean of the differences of the initial runs over the two-week
period. The percentage correct is calculated with a simple formula:

\[
\frac{\#\,\text{true} - \#\,\text{false}}{\#\,\text{true} + \#\,\text{false}} \times 100
\]
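
As a quick check of how this score behaves, the formula can be written as a small function. The function name and the example counts are hypothetical and are not taken from the report's data.

```python
def percent_true_minus_false(n_true, n_false):
    """Score = 100 * (#true - #false) / (#true + #false).

    Ranges from -100 (every point incorrect) to +100 (every point correct).
    """
    total = n_true + n_false
    if total == 0:
        raise ValueError("no grid points to score")
    return 100.0 * (n_true - n_false) / total


# Hypothetical counts: 80 correct and 20 incorrect points give a score of 60,
# so a score near 60 corresponds to roughly 80% of the points being correct.
example_score = percent_true_minus_false(80, 20)
```

Note that a score of 0 means as many incorrect as correct points, not zero skill in an absolute sense.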

The  largest  positive  numbers  indicate  that  the  cutoff  frequency  is  more  correct  when  compared 
with  “truth.”  Case  D  is  the  winner  in  this  set,  which  is  interesting  as  the  calculation  is  only  done 
over  the  target  area.  Therefore,  Case  D  can  be  a  viable  option  for  efforts  in  areas  where  an 
ensemble  run  is  not  possible. 

Case Bp has the best result among the ProfDA cases, which is expected as the calculation is over
the target area. This indicates that for Case B the glider data is being phased or masked out by
the presence of other data in NCODA.

Case Bf has the best result for the ICfree cases. Introducing the glider data has the strongest
result with the presence of ensembles, but Case Df also has a strong showing, indicating that the
model forecasts are still an improvement.


Table 4: % Differences over Target Area (Time=00Z Analysis)
Mean Difference True vs. False

Treatment                          Ensemble  True Error  Forecast Error  Lawnmower  Free  Control
Standard                           61.3      62.8        63.6            62.4       53.3  62.8
Glider Profile Data Assimilation   63.3      56.4        60.4            58.3       -     -
Free Running IC's                  65.7      60.9        62.1            63.9       -     -

In order to measure forecast skill, the same comparison of true COF minus false COF was
performed for the 48 hr forecast. Fig. 14 shows the same increasing trend as was seen with the 00
hour analyses. The glider data improves the results over the two-week period. Table 5
takes the mean of the differences over the nineteen days and shows a close result for Case B vs.
Case E. Case Br shows a very poor result. Limiting the NCODA radius may not have allowed
the data to spread far enough to capture features that evolved over the forecast time.

Fig. 14 A running calculation of the difference between true and false calculations of cutoff
frequency for the 48 hour forecasts of the original control run spinup OSSEs in the target area.


In the cases where only glider profiles were available for data assimilation, the glider profiles
alone do not fill out the whole field as well as other data. Fig. 16 shows a large spread between
the performances of all of the cases. The Case Dp result with model data gives a better
performance than Case Bp. As with Case Br limiting the range of the data over the
forecast period, Case Bp may also be too limiting in area coverage for the glider data alone.
The Case Cp known-error case does best where only profiles are available, but that is to be
expected given a known forecast error.

The ICfree cases (Fig. 17) show the highest values in the upward trend over the nineteen days
and the highest mean value for the nineteen days. Case Bf has the best OSSE mean value by a
large amount. The value of the TOFU ensemble-based cost functions makes a continued impact
when other cases are not as strong. The lawnmower case especially drops off toward the end of
the 19 days, probably because the gliders do not capture important features as the forecast
evolves.

Fig. 16 A running calculation of the difference between true and false calculations of cutoff
frequency for the 48 hour forecasts of the glider profile only OSSEs in the target area.

Fig. 17 A running calculation of the difference between true and false calculations of cutoff
frequency for the 48 hour forecasts of the free running spinup OSSEs in the target area.

Table 5: % Differences over Target Area (Time=48hr Forecast)
Mean Difference True vs. False

Treatment                          Ensemble  True Error  Forecast Error  Lawnmower  Control
Standard                           62.0      62.7        60.7            62.1       60.2
Glider Profile Data Assimilation   52.7      57.0        55.5            56.3       -
Free Running IC's                  65.7      60.2        60.5            57.7       -

3.3  Conclusions 

The simulated true ocean experiment did not provide the expected results, and we could not
definitively prefer one criterion for the cost functions over the others. The simulations have been
highly affected by NCODA performance. NCODA is effective when there are large (realistic)
errors between the assimilated data and the forecasted (i.e., the NCODA background) field. In our
case, the natrun and contrun (i.e., the initial condition for all experiments) simulations are too
similar. Since NCODA extends the influence of the assimilated profile over a radius determined by
the correlation length scale, it introduces distortions in the area surrounding the assimilated
profile when the background field is too close to the true ocean. The problem with the
correlation length scale and the too-similar background field is confirmed by CaseBr, where
the solution is improved by reducing the correlation length scale (i.e., the radius of influence of
the assimilated profiles). Therefore, the cases starting with the free run have the better
performance because the data assimilation is effective in correcting the background field, and
after the initial adjustment the results are comparable with the contrun. Finally, we should not
disregard the influence of the extensive interpolation in simulating the glider descending and
ascending path.

Moreover, the Okinawa Trough region is characterized by high variability and internal wave
propagation, so that small perturbations may lead to large phase differences in the propagating
fields and in the acoustic parameters. Overall, introduction of glider data does improve the
OSSE performance. This is supported by the upward trends of the positive difference in the
charts. More analysis needs to be done on the specific use of this data with NCODA. The
current system of man-in-the-loop glider guidance can still be utilized. However, as shown in
Tables 4 and 5, the GA can outperform the man-in-the-loop scenario. This will be especially
useful in situations where many gliders exist and a more automated approach is warranted.

4.  MREA 10


The  MREA_10  aims  to  exploit  remotely  sensed  satellite  data  for  (1)  extraction  of  near  surface 
geophysical  parameters,  (2)  utilization  of  a  fleet  of  gliders  (AUVs)  to  map  out  the  physical  and 
bio-optical  properties  in  the  water  column  prior  to  and  during  the  cruise,  (3)  deployment  of 
drifters  and  HF  radar  to  determine  turbulent  transport  and  dispersion,  (4)  deployment  of 
moorings  to  initialize  and  set  boundary  conditions  for  atmospheric  and  oceanic  models  and 
finally  (5)  assimilation  and  fusion  of  all  data  into  bio-optical  and  physical  METOC  models, 
providing  an  integrated  approach  for  near  realtime  METOC  data  collection  and  modeling.  The 
exercise is focused on the littoral zone, which is very dynamic and the most difficult area in
which to accurately retrieve remotely sensed geophysical parameters. As part of the project, the
NATO Undersea Research Centre (NURC) supported a cruise in the Ligurian Sea to sample areas
similar to those of two previous trials: the Ligurian Sea Cal/Val 2008 (LSCV’08, Oct 2008) and
the Battlespace Preparation 2009 (Mar 2009). Slocum coastal gliders were deployed before and
during the cruise, and during the trial an effort was made to sample specific areas in a
‘glider fleet’ mode. In support of the modeling efforts, drifters were deployed to track turbulent
transport and dispersion in the study area. Two moorings were also deployed to assist in
initializing and setting boundary conditions for the models and the HF radar study: one mooring
was south of Portofino and the other off Palmaria Island.

4.1  The  Real  Time  Exercise 

NRLSSC  participated  in  the  MREA_10  providing  in  realtime  a  full  feedback  cycle  using  model 
prediction  to  guide  the  glider  and  assimilating  the  collected  data  back  into  the  model.  The  goal 
was  to  verify  how  an  ‘intelligent’  guidance  of  the  gliders  would  improve  the  forecast  skill  of  the 
model.  The  NRLSSC  modeling  effort  was  based  on  the  RELO  system.  Fig.  18  illustrates  the 
model  configuration  which  consisted  of  3  nested  domains  forced  by  the  COAMPS_europe3 
surface  forcing  (Hodur,  1997)  and  Open  Boundary  Conditions  (OBC)  extracted  from  the 
simulations of the parent domain. The OBC for the outermost nest were extracted from
G8NCOM. Monthly river discharges were from the global river data set of the 1/8° Global NCOM
(G8NCOM) (Barron and Smedstad, 2002), with the Arno, Magra, and Serchio transports
provided by the Istituto Idrografico Italiano. Each domain had 40 sigma- and 10 z-levels in the
vertical (50 levels). The outer nest, nest0, was at 4 km horizontal resolution, with the
primary purpose of serving as a buffer zone between G8NCOM’s NOGAPS forcing and the
higher resolution wind data set. Nest1 (2 km resolution) included tides, which were specified at
the boundaries from the Oregon State University tide model (Egbert and Erofeeva, 2002). An
ensemble of 32 independent runs of nest1 was also made available in realtime. The simulations
were initialized by the Ensemble Transform Kalman Filter (Bishop et al., 2001) using
atmospheric forcings perturbed by the space-time deformations method (Hong and Bishop,
2007). Nest2 was at about 0.6 km resolution and configured for the operational area. While data
assimilation was performed on both nest0 and nest1, nest2 was a free run (i.e., the effects of data
assimilation enter through the OBC). Thus, nest2 may be considered a dynamical interpolation
of nest1, and it also served as a benchmark for model evaluation and comparison. Data were
retrieved daily from both the NRLSSC and NURC data servers.


Fig.18  The  triple  nest  configuration  for  MREA_10. 

Although 5 gliders were operating during the exercise, guidance was suggested for and data
assimilated from glider LAURA only. This was to verify the impact of a few data aimed at
improving the forecast in a pre-defined area (i.e., the target area) vs. a broad range of data
collected to cover a larger area (i.e., the operational area). In order to gather more realtime data
for NCODA, the analysis time was set at -24 hours and the temperature and salinity innovations
inserted between -24 and 0 hrs (henceforth, 0 hours refers to 00 GMT of the current day of
the realtime operations). From the analysis, 96 hours of effective forecast (i.e., 120 hours of
computational forecast) and ensemble runs were provided, and the results were processed in
NetCDF files posted on a password-protected web page. The forecast and ensemble mean and rms
were also incorporated in the super-ensemble approach that was run in parallel at NURC
(Mourre et al., 2010). The cost function for the GA was derived from the forecast field and
ensemble variability, and the 48 hour guidance path was shared with the NURC glider pilots.

The main issue of realtime operations is the timely delivery of the results. In this exercise we had
about 6 hr from the availability of the forecast atmospheric fields to the deadline for sending the
glider path guidance for the next cycle. To speed up the beginning of the simulations, the -24 hr
G8NCOM full forecast fields were used to provide OBC to nest0. The full cycle of modeling and
processing of data and model outputs was performed at NRLSSC on dual 64-processor
Opteron-based Linux platforms.

The model simulations started on July 1st 2010 to assess and calibrate the model configuration
and to verify the realtime practical applicability. The interactions with glider LAURA were from
August 20-28, 2010. Fig. 19a depicts the surface velocity and temperature field for Aug 25 and
Fig. 19b the associated ensemble spread. In MREA_10, EMPath used cost functions based on
weighted sums of different constituents, including ensemble spreads and specialized acoustic
parameters (Fig. 20) (Coelho et al., 2010b). Finally, Fig. 21 illustrates the full LAURA track and
the delivered paths as computed by EMPath.


Fig. 19 The surface velocities over the temperature field (a) and ensemble STD (b) for Aug 25.
Operational and target areas are delimited in white and black, respectively.


As outlined in section 2, two sets of constituent cost functions (CCF) were used in the REP 10
adaptive sampling experiment. The first uses a method based upon the Ensemble Transform
Kalman Filter (ETKF) to determine regions where measurements maximally reduce the forecast
uncertainty for a wide range of observables. TOFU products are generated for reduction in
uncertainty in temperature, as well as in acoustic parameters: Below Layer Gradient (BLG),
In-Layer Gradient (ILG), and Sonic Layer Depth (SLD). For this test only temperature was used.
The second set was based upon the ensemble spreads as well as the temporal and spatial
variability of the model forecast temperature field. The TOFU CF temperature field is in the first
column, fourth row of Fig. 20. The ensemble spreads at 0, 25 and 100 m are shown in the upper
row. The spatial variability of the mean T field at the 3 specified depths is in the second row,
indicating much more spatial variability at 0 and 25 m compared with 100 m. The temporal
variability of the ensemble-mean field is shown in the third row. The final CF for the uniformly
weighted combination (all 1s) of all 10 CCF is shown in the lower right panel.

Fig. 20 The input fields to the cost function used by EMPath.
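
The weighted combination of constituent cost functions can be illustrated with a minimal sketch. This is not EMPath code: the function name is hypothetical, each constituent field is assumed to be flattened to a list of values on a common grid, and any normalization that EMPath may apply to the constituents before summing is not represented.

```python
def combine_cost_functions(fields, weights):
    """Weighted sum of constituent cost function (CCF) fields.

    fields  : list of equally sized lists, one per CCF (flattened grids)
    weights : one weight per CCF; all 1s gives the uniform combination
    """
    if len(fields) != len(weights):
        raise ValueError("need exactly one weight per constituent field")
    n_points = len(fields[0])
    if any(len(field) != n_points for field in fields):
        raise ValueError("all constituent fields must share the same grid")
    return [
        sum(w * field[i] for w, field in zip(weights, fields))
        for i in range(n_points)
    ]


# Toy example with three constituents on a two-point grid.
toy_fields = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
uniform = combine_cost_functions(toy_fields, [1, 1, 1])
```

Changing a single weight, for example the one applied to the TOFU field, shifts the balance between constituents without touching the fields themselves.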


Operationally,  EMPath  was  run  daily  with  a  48  hour  forecast.  Optimal  glider  paths  were 
provided  to  NURC  daily,  although  the  navigation  was  updated  every  other  day.  The  GA  was  run 
with  500  individuals  for  80  generations.  In  order  to  check  convergence  and  uniqueness  of 
solution  it  was  run  5  times  with  different  random  seeds.  In  order  to  represent  results  to  the  user, 
an  estimated  CF  morphology  is  computed.  To  compute  this  function,  a  glider  is  positioned  at 
each  point  in  the  spatial  grid  and  samples  the  multi-dimensional  cost  function  for  3  hours  going 
North,  East,  West  and  South.  These  are  averaged  to  generate  a  value  of  the  weighted  CF  at  this 
point.  The  solutions  are  plotted  over  the  CF  morphology  to  provide  the  user  with  confidence  in 
the  result.  Note  that  the  morphology  is  only  a  sparse  sampling  of  the  extensive  time-dependent 
cost  function.  The  weighting  of  cost  functions  was  tapered  with  time.  Initial  weightings  favored 
the  TOFU  CF-Temp  function  due  to  the  emphasis  on  a  target  area.  As  time  progressed  and  the 
target  area  became  less  critical,  the  relative  weighting  of  the  TOFU  CF  was  reduced. 
Specifically, for August 20-23, the TOFU weighting went from 12 to 6 to 0, with a weight of 1
for each of the 6 ensemble spread CF. This corresponds to relative weightings of 2-1, 1-1 and
0-1. The 5 best solutions are plotted on the CF morphology in Fig. 21.
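
The morphology calculation described above can be sketched under strong simplifications. In this illustration the cost function is a single static 2-D grid, the hypothetical glider covers a fixed number of grid cells in 3 hours, currents are ignored, and the starting cell itself is not sampled. None of these choices is stated in the report, and the function name is made up for the example.

```python
def cf_morphology(cf, steps):
    """Average the cost function sampled along N, S, W and E legs.

    cf    : 2-D list of cost function values (rows run north to south)
    steps : grid cells covered in 3 hours (assumed constant glider speed)
    """
    nrows, ncols = len(cf), len(cf[0])
    directions = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # N, S, W, E
    morph = [[0.0] * ncols for _ in range(nrows)]
    for r in range(nrows):
        for c in range(ncols):
            leg_means = []
            for dr, dc in directions:
                # Collect the samples of this leg that stay inside the grid.
                samples = [
                    cf[r + k * dr][c + k * dc]
                    for k in range(1, steps + 1)
                    if 0 <= r + k * dr < nrows and 0 <= c + k * dc < ncols
                ]
                if samples:
                    leg_means.append(sum(samples) / len(samples))
            # Fall back to the local value if no leg fits inside the grid.
            morph[r][c] = sum(leg_means) / len(leg_means) if leg_means else cf[r][c]
    return morph
```

Because each value summarizes only four short legs, such a map is a coarse view of the full time-dependent cost function, which matches the caveat in the text.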

Fig. 21 Morphologies and 5 best GA solutions for Aug 20, 22 and 24. Upper left is 8/24
with a 2-1 TOFU weighting. Upper right is 8/22 with a 2-1 weighting; lower left is the same
day with a 1-1 weighting. The lower right panel is 8/24 with a 0-1 weighting.

To illustrate the ability of EMPath to generate glider paths that are achievable within the context
of dynamic ocean currents, Fig. 22a shows the sampling guidance (colored lines) overlaid on
the actual LAURA positions for the 6 days of the test. For the first day (red) the guidance started
late, so there is a mismatch between the guidance and the actual positions. Beyond the
north-east corner on day 1 (red), the sampling guidance is exceptionally well executed by
LAURA. The 5 GA solutions for August 24 are shown in Fig. 22b.

Fig. 22 a) The real track of LAURA (black) and the different tracks delivered to NURC (colored lines).
Points A and B indicate the starting and ending points of LAURA's full path, respectively. b) The
morphology function and the best 5 runs for Aug 24-26.

During the realtime operations, some preliminary evaluation and validation were conducted
within the limits of the available raw data, which had not been quality controlled. Fig. 23
illustrates how the data assimilation corrects the position of an eddy on the northern side of the
target area. Both panels have identical time stamps but different forecast hours, since they come
from two different model cycles. Interactions with NURC confirmed the presence of the eddy at
the position indicated by Fig. 23 (Alvarez, personal communication).


Fig. 23 Snapshot images of surface velocities over temperature for 2010082406 as
forecasted a) at the early stages of assimilating the glider profiles (78 hr forecast from
2010082100, August 21st) and b) after a few assimilation cycles (6 hr forecast from
2010082400, August 24th).

Fig. 24 indicates how the forecast errors were dramatically reduced in the target area as the
LAURA data were inserted in the model forecast cycle. We computed the RMS error
between the data collected inside the target area and the model profiles at the closest points (in
time and space) of the first 48 hr forecast of each cycle. The data used for the comparison had
not yet been assimilated on the day of the evaluation. As expected, the inner, higher-resolution
free nest is initially more accurate than the outer nest, but as the assimilation of the glider data
starts, the outer nest errors are reduced, and a few assimilation cycles are needed to transmit the
correction into the inner domains.

Fig. 24 The max and mean RMS value of the error between observations and model
solutions in the target area. Nest1 (black), nest2 (red).
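
A minimal sketch of this kind of model-data comparison follows. The report does not say how "closest in time and space" was decided, so the rule used here (nearest output time first, then nearest horizontal position) is an assumption, as are the function names and the dictionary layout. Observed and model profiles are assumed to be already on the same depth levels.

```python
import math


def rms_error(obs_values, model_values):
    """Root-mean-square difference between two profiles on the same levels."""
    if len(obs_values) != len(model_values) or not obs_values:
        raise ValueError("profiles must be non-empty and on the same depth levels")
    squared = [(o - m) ** 2 for o, m in zip(obs_values, model_values)]
    return math.sqrt(sum(squared) / len(squared))


def closest_model_profile(obs, model_profiles):
    """Pick the model profile nearest in time, then nearest in space.

    obs and each model profile are dicts with keys
    'time' (hours), 'lon', 'lat' (degrees) and 'temp' (list of values).
    """
    best_dt = min(abs(p["time"] - obs["time"]) for p in model_profiles)
    candidates = [p for p in model_profiles if abs(p["time"] - obs["time"]) == best_dt]
    return min(
        candidates,
        key=lambda p: (p["lon"] - obs["lon"]) ** 2 + (p["lat"] - obs["lat"]) ** 2,
    )
```

Repeating the RMS calculation for every profile collected in the target area on a given day, and taking the mean and maximum, yields the two kinds of curves named in the Fig. 24 caption.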


4.2  The  Reanalysis 

After the completion of the realtime exercise, two parallel experiments were conducted
simulating the ‘realtime’ mode: 1) assimilating data collected by all 5 gliders (henceforth
referred to as Allgliders or Allgl) and 2) assimilating all but the LAURA data (henceforth
referred to as Nolaura or NoLr). The simulations assimilating LAURA only are henceforth
referred to as Realtime or Realt. Fig. 25 illustrates the number of profiles assimilated in each
case. The difference in number is substantial and should be taken into consideration when
comparing and evaluating the 3 experiments. Unfortunately, we had no control over the
observations accepted by NCODA. As Fig. 25 indicates, Allgl does not assimilate all the Realt
profiles nor all the NoLr profiles, but it is important to note that Realt assimilates one order of
magnitude fewer profiles than the other cases.

Fig. 25 The number of profiles assimilated in the experiments: i) Allglider (green), ii)
Realt (blue), iii) NoLr (red), iv) Realt + NoLr (dashed black line), v) Allgl - NoLr (dashed
green line).

Data for the new experiments were retrieved during the operations at sea; therefore, they
have the same lack of quality control as the Realt simulations, leaving NCODA to discard
‘bad’ measurements. On the other hand, the new model-data comparison was conducted
with the quality-controlled observations that NURC made available after the conclusion of
the exercise. This new data set also includes the CTDs collected by the NRV Alliance.

Fig. 26 compares one profile with the solutions of two different forecast cycles, and Fig. 27 (right)
depicts the RMS errors as a function of depth at an early stage of the assimilation (Aug 21st) and
after a few assimilation cycles (Aug 25th). As expected, at the beginning all nests have similar
error distributions, with the inner high-resolution nests being more accurate. However, as the
assimilation continues, the error of the outer nest is reduced, with a greater gain in the upper
levels. The Allglider experiment is also more accurate at the thermocline. We can deduce that
Realtime lacks information outside the area sampled by LAURA that can propagate the
analysis correction inside the target area, and that the NoLaura experiment lacks observations
aimed at improving the forecast in the target area.

Fig. 26 Model-data comparison for CTD 1597 at (9.28E, 43.7N) on Aug 25, 06:58: a) cycle
2010082400, forecast 31 hr; b) cycle 2010082500, forecast 7 hr. Nest1 (solid lines) and Nest2
(dashed lines). Realt (blue), Allgl (green), NoLr (red).

Fig. 27 The RMS error between the model and the profiles collected in the target area in
the first 48 hr forecast. Panel titles: NCOM vs. CTD RMS temperature for the two periods shown.

One of the main issues in these data assimilation experiments is how to remove the bias,
especially at the deeper levels not sampled by the gliders. To evaluate the impact of the
assimilation on correcting the background bias, we computed the distribution of the error
between data and models (Fig. 28a-c) and then computed the best-fit 5th order polynomial, $P_5$,
of each histogram (Fig. 28d). Let $x_{max}$ and $x_{efold}$ be the points such that
$P_5(x_{max}) = \max P_5$ and $P_5(x_{efold}) = 0.5\,P_5(x_{max})$. Then $x_{max}$ and
$x_{efold}$ are representative of the background bias and of the decay of the error, respectively.
Fig. 29 illustrates the evolution of the mean (over depth) bias during the exercise. Toward the
end of the trial, very few data were collected inside the target area, and after Aug 26th the plot
may not be statistically representative. However, it is evident that the background bias is
appreciably reduced for all the experiments and all domains.
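
The polynomial-fit diagnostics can be sketched with NumPy. This is an illustration only: the function name and the evaluation grid are hypothetical, and because the report does not say on which side of the peak the half-height point is taken, the first crossing to the right of the peak is assumed here.

```python
import numpy as np


def bias_and_decay(centers, counts, degree=5, n_eval=2001):
    """Fit a polynomial to a histogram and locate its peak and half-peak.

    centers, counts : histogram bin centers and bin counts
    Returns (x_max, x_efold), where x_efold is the first point to the right
    of the peak at which the fit falls to half the peak value (None if the
    fit never drops that far inside the data range).
    """
    centers = np.asarray(centers, dtype=float)
    counts = np.asarray(counts, dtype=float)
    coeffs = np.polyfit(centers, counts, degree)
    x = np.linspace(centers.min(), centers.max(), n_eval)
    p = np.polyval(coeffs, x)
    i_max = int(np.argmax(p))
    below = np.nonzero(p[i_max:] <= 0.5 * p[i_max])[0]
    x_efold = float(x[i_max + below[0]]) if below.size else None
    return float(x[i_max]), x_efold
```

In practice the bin centers and counts would come from a histogram of the model-minus-data errors, for example via `numpy.histogram`.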


[Figure panels: Realt - 2010-08-25 0:48; AllGlider - 2010-08-25 0:48; Nolaura - 2010-08-25 0:48; RMS distribution - 2010-08-25 0:48]


Fig. 28 The errors between data and model 48hr forecast for day August 25th: nest1 (blue)
and nest2 (green), a) Realt, b) Allgl, c) NoLr, and d) best fit of the histograms with a 5th order
polynomial. See text for explanations of terms.

Fig 29. The background bias of the model simulations as a function of time. Nest1 (solid line),
Nest2 (dashed line). See text for definition of terms.


4.3  Conclusions 

We have conducted a realtime exercise in which the ocean model forecast and ensemble
variability served as input to a Genetic Algorithm that provided guidance to gliders, and the
collected data were assimilated back into the model. Even though 5 gliders were deployed
during the trial, only one glider, LAURA, was guided and had its collected data assimilated. The
experiment was quite successful, indicating the practical feasibility of the procedure and showing
how 'intelligent' sampling can appreciably reduce the forecast errors in a target area.

We have also conducted parallel experiments assimilating data from all the available gliders
(Allgl) and from all gliders except LAURA (NoLaura). In general, Allgl has the best performance:
assimilating more data in the operational area reduces the errors, as does the high number of
profiles in NoLaura, and the corrections from the data propagate into the target area. On the
other hand, targeted, 'intelligent' guidance of only one glider provides the same kind of accuracy
with at least one order of magnitude fewer collected data.

5.  Summary 

This document is designed to evaluate the impact of the EMPath Genetic Algorithm on the
adaptive sampling strategies used to direct and guide gliders during realtime operations. EMPath
has been successfully interfaced with the RELO forecast system and applied with several criteria
and approaches for defining the driving cost function. The validation tests have been designed to
verify the skills and limits of the several approaches and to document the results of a realtime
exercise, MREA 10.


Although the OSSE did not provide a clear indication of the differences between the several
approaches, it should not be forgotten that the application of each criterion should take into
consideration the goals and aims of the operation. The ensemble approach is best suited to
realtime operations in limited areas where there is a clear need to improve the forecasting skill
of the model and to reduce the errors in derived variables such as the acoustic properties of the
area. MREA 10 is a clear demonstration of a case where few data from a well-directed glider had
an impact comparable to that of assimilating many more observations.

When the goal is mainly to improve the forecast at a meso/regional scale, it may be preferable to
adopt the less computationally intensive approach based on the forecast error. Finally, the
lawnmower approach is well suited to long-term surveys in areas where the impact of the
assimilated data may propagate well outside the operational area.

8. Acronyms


Acronym   Description

ASCII     American Standard Code for Information Interchange
CFL       Courant-Friedrichs-Lewy scheme
CF        Cost Function
CCF       Constituent Cost Function
COAMPS    Coupled Ocean Atmosphere Mesoscale Prediction System
COF       Cutoff Frequency
DoD       Department of Defense
EMPath    Environmental Measurements Path Planner
ETKF      Ensemble Transform Kalman Filter
G8NCOM    1/8° Global NCOM
GA        Genetic Algorithm
GDEM      Generalized Digital Environmental Model
GOST      Glider Observation Sampling Strategies
GUI       Graphical User Interface
IAMPS     Integrated Acoustic Multienvironmental Processing System
ILG       In-Layer Gradient
METOC     Meteorological and Oceanographic
MREA10    Maritime Rapid Environmental Assessment of 2010
NAVO      Naval Oceanographic Office
NCODA     Navy Coupled Ocean Data Assimilation
NCOM      Navy Coastal Ocean Model
NOGAPS    Navy Operational Global Atmospheric Prediction System
NRL       Naval Research Laboratory
NRLSSC    Naval Research Laboratory - Stennis Space Center
NetCDF    Network Common Data Form
NURC      NATO Undersea Research Centre
OBC       Open Boundary Conditions
OSSE      Observation System Simulation Experiment
RELO      Relocatable Circulation Prediction System
RMS       Root Mean Square
SLD       Sonic Layer Depth
TOFU      Target Observations Using Forecast Uncertainties
UAVs      Underwater Automated Vehicles
VTR       Validation Test Report